An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test
Abstract
Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can indicate a disease-associated gene. As genotyping studies use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally efficient method based on an embedded likelihood-ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's accuracy advantage becomes even more pronounced in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/.
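The abstract does not spell out PARENTE's model, so the following is only a generic illustration of what a likelihood-ratio test on unphased genotypes can look like, not the authors' embedded test. It is a minimal sketch under stated assumptions: biallelic SNPs coded as minor-allele counts (0, 1, 2), Hardy-Weinberg proportions, independence between SNPs (no linkage disequilibrium), and a crude error term `eps` that keeps the ratio finite. All function names and parameters are hypothetical.

```python
import math

def hw_prob(g, p):
    """Hardy-Weinberg probability of genotype g (count of the allele with frequency p)."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[g]

def allele_prob(x, p):
    """Probability that a single allele is the counted allele (x=1) or not (x=0)."""
    return p if x == 1 else 1.0 - p

def ibd1_prob(g1, g2, p):
    """Joint genotype probability when the pair shares exactly one allele IBD.

    The shared allele s and each individual's other allele (a, b) are drawn
    independently, with g1 = s + a and g2 = s + b.
    """
    total = 0.0
    for s in (0, 1):
        a, b = g1 - s, g2 - s
        if a in (0, 1) and b in (0, 1):
            total += allele_prob(s, p) * allele_prob(a, p) * allele_prob(b, p)
    return total

def window_log_lr(geno1, geno2, freqs, eps=0.01):
    """Sum of per-SNP log likelihood ratios, IBD1 versus unrelated, over one window.

    eps is an assumed mixing weight standing in for genotyping error; it is
    not taken from the source. Allele frequencies must lie strictly in (0, 1).
    """
    score = 0.0
    for g1, g2, p in zip(geno1, geno2, freqs):
        p_unrelated = hw_prob(g1, p) * hw_prob(g2, p)
        p_ibd = (1.0 - eps) * ibd1_prob(g1, g2, p) + eps * p_unrelated
        score += math.log(p_ibd / p_unrelated)
    return score
```

In this sketch, a window whose score exceeds a chosen threshold would be flagged as a candidate IBD segment; sharing of rare alleles pushes the score up, while opposite homozygotes pull it sharply down.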
Related articles
Efficient whole-genome association mapping using local phylogenies for unphased genotype data
MOTIVATION: Recent advances in genotyping technology have made data acquisition for whole-genome association studies cost-effective, and a current active area of research is developing efficient methods to analyze such large-scale datasets. Most sophisticated association mapping methods that are currently available take phased haplotype data as input. However, phase information is not readily avail...
Modified signed log-likelihood test for the coefficient of variation of an inverse Gaussian population
In this paper, we consider the problem of two-sided hypothesis testing for the coefficient of variation of an inverse Gaussian population. The approach used here is the modified signed log-likelihood ratio (MSLR) method, a modification of the traditional signed log-likelihood ratio test. Previous works show that this proposed method has third-order accuracy whereas the traditi...
Accurate Inference for the Mean of the Poisson-Exponential Distribution
Although the random sum distribution has been well studied in probability theory, inference for the mean of such a distribution is very limited in the literature. In this paper, two approaches are proposed to obtain inference for the mean of the Poisson-Exponential distribution. Both proposed approaches require the log-likelihood function of the Poisson-Exponential distribution, but the exact for...
Nonlinear Vibration Analysis of Embedded Multiwalled Carbon Nanotubes in Thermal Environment
In this article, based on the Euler-Bernoulli beam theory, the large-amplitude vibration of multiwalled carbon nanotubes embedded in an elastic medium is investigated. The method of incremental harmonic balance is implemented to solve the set of governing nonlinear equations coupled via the van der Waals (vdW) interlayer force. The influences of the number of tube walls, the elastic medium, nanotub...
A performance assessment of relatedness inference methods using genome-wide data from thousands of relatives
Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these methods in real data has been lacking. Here, we report an assessment of 11 state-of-the-art relatedness inference methods using a dataset with 2,485 individuals contain...